p16INK4A flow cytometry of exfoliated cervical cells: Its role in quantitative pathology and clinical diagnosis of squamous intraepithelial lesions

Abstract

Background
P16INK4A is a surrogate signature compensating for the specificity and/or sensitivity deficiencies of the human papillomavirus (HPV) DNA and Papanicolaou smear (Pap) co-test for detecting high-grade cervical squamous intraepithelial lesions or worse (HSIL+). However, traditional p16INK4A immunostaining is labour intensive and skill demanding, and subjective biases cannot be avoided. Herein, we created a high-throughput, quantitative diagnostic device, p16INK4A flow cytometry (FCM), and assessed its performance in cervical cancer screening and prevention.

Methods
P16INK4A FCM was built upon a novel antibody clone and a series of positive and negative (p16INK4A-knockout) standards. Since 2018, 24 100 women (HPV-positive/-negative, Pap-normal/-abnormal) have been enrolled nationwide for two-tier validation work. In cross-sectional studies, age- and viral genotype-dependent expression of p16INK4A was investigated, and optimal diagnostic parameter cut-offs (using colposcopy and biopsy as a gold standard) were obtained. In cohort studies, the 2-year prognostic values of p16INK4A were investigated with other risk factors by multivariate regression analyses in three cervicopathological conditions: HPV-positive Pap-normal, Pap-abnormal biopsy-negative and biopsy-confirmed LSIL.

Results
P16INK4A FCM detected a minimal ratio of 0.01% positive cells. The p16INK4A-positive ratio was 13.9 ± 1.8% among HPV-negative NILM women and peaked at the ages of 40–49 years; after HPV infection, the ratio increased to 15.1 ± 1.6%, varying with the carcinogenicity of the viral genotype. Further increments were found in women with neoplastic lesions (HPV-negative: 17.7 ± 5.0–21.4 ± 7.2%; HPV-positive: 18.0 ± 5.2–20.0 ± 9.9%). Extremely low expression of p16INK4A was observed in women with HSILs. When the HPV-combined double-cut-off-ratio criterion was adopted, a Youden's index of 0.78 was obtained, which was significantly higher than that (0.72) of the HPV and Pap co-test. The p16INK4A-abnormal situation was an independent HSIL+ risk factor for 2-year outcomes in all three cervicopathological conditions investigated (hazard ratios: 4.3–7.2).

Conclusions
FCM-based p16INK4A quantification offers a better choice for conveniently and precisely monitoring the occurrence of HSIL+ and directing risk-stratification-based interventions.


Graphical Abstract
• The quantitative detection system, p16INK4A FCM, was established for triaging cervical high-grade neoplasia/cancer.
• Cross-sectional studies of 24 100 women revealed that the HSIL+-triaging efficacy of the HPV and p16INK4A FCM co-test is significantly higher than that of the HPV and Pap co-test.
• Two-year cohort studies indicated that HPV-positive/Pap-abnormal women with an abnormal p16INK4A level would run an HSIL+ risk more than 4-fold higher than those with a normal p16INK4A level.

BACKGROUND
Cervical lesions caused by human papillomavirus (HPV) undergo a series of molecular events to eventually develop into invasive cancers; through this process, the affected cells acquire unrestricted proliferation and metastasis features. [1][2][3] The remarkable events, in order of occurrence, include the expression of viral E6 and E7 oncoproteins, loss of function and degradation of the host tumour suppressors p53 and RB, disinhibited expression of a minor tumour suppressor, p16INK4A, and finally, reactivated expression of Ki-67 and hTERT, which are two necessary components of genome DNA replication. [4][5][6] Current cervical cancer-monitoring techniques, such as the HPV DNA test and Papanicolaou smear (Pap), only inform clinicians of the onset of HPV infection and the cytological outcome of infection-induced molecular events. 7 The latter, that is, the Pap test, provides a cytopathological diagnosis that is currently categorised by the Bethesda System (TBS) terms, namely, negative for intraepithelial lesion or malignancy (NILM), atypical squamous cells of undetermined significance/cannot exclude a high-grade lesion (ASC-US/ASC-H), low-grade/high-grade squamous intraepithelial lesion (LSIL/HSIL), squamous cell carcinoma (SCC) and so on. 8 To date, there is a lack of tools to probe HPV-induced intracellular molecular alterations before the formation of pathomorphological lesions. Nevertheless, p16INK4A overexpression has been confirmed to be a key event in cervical carcinogenesis and is a unique signature for estimating viral E6- and E7-induced loss of p53 and RB function. 4,9,10 However, although the p16INK4A mono- or p16INK4A/Ki-67 dual-immunostaining techniques have been widely applied by pathologists to recognise precancerous/cancerous lesions and have achieved significant success in cervical pathology, [11][12][13][14][15][16] detection tools for closely monitoring p16INK4A dysregulation events as well as for real-time assessment of cervical pre-cancer/cancer risks are not yet available.
The flow cytometric (FCM) technique offers a high-throughput, quantitative and automatic platform for determining the absolute number (or percentage) of abnormal cells. 17 This technique has been successfully used in the immunophenotyping and counting of a specific group/histotype of peripheral blood cells (e.g., leukaemia). 18,19 However, unlike the cytotype-specific expression pattern of clusters of differentiation (CD proteins) in blood cells, 19,20 the viral E6-/E7-induced dysregulated expression of p16INK4A (e.g., overexpression) continuously changes from extremely low levels to excessively high levels, which spans a wide profile of pathological conditions, namely, normal, inflammatory, pre-cancerous (e.g., ASC, LSIL, ASC-H and HSIL) and cancerous (e.g., invasive cancer) conditions. 10 Despite an increasing overexpression trend of p16INK4A in the high-grade lesion and invasive cancer conditions, there are no substantial gaps or nicks visible in the two-dimensional curve of cell number versus p16INK4A signal intensity that might be used as a signature to divide abnormal/neoplastic cells from normal cells during FCM analysis 21 ; conversely, for other histotypes of cells, such as leukaemia cells, these gaps can be easily observed and used to discriminate neoplastic cells from non-neoplastic cells, thereby dividing them into two independent groups. 19,20 Moreover, due to the scarce number (1/10 000-10/10 000) of neoplastic (pre-cancer/cancer) cells, true p16INK4A-positive cells can be submerged by a number of mistakenly labelled pseudo/autofluorescent cells, and this comprises a major source of background noise in FCM. 21 Other reasons for false-positive signals include unstable photon-electron transformation during the FCM detection process and/or physical-physiological variations/changes in the collected cells, such as extreme cell sizes and cell-to-cell adhesions. [21][22][23]
In recent years, we have sought to construct a reliable detection system to quantitatively profile the p16INK4A expression status in cervical epithelial cells. In the p16INK4A quantification system developed, which was modified from a traditional FCM platform, most random measurement errors, such as those caused by the optoelectronic fluctuations of FCM signal processing or the biophysical heterogeneity of the exfoliated cells, were satisfactorily controlled by a reference sample (blank-labelled) prepared in parallel; additionally, the systemic errors caused during sample-harvesting/labelling processes and/or FCM read-in/readout procedures were rectified by a set of external standards (a calibrating ladder). These refinements enabled the use of FCM to detect abnormal cells of extremely low abundance (1/10 000-10/10 000) that were labelled with a difficult-to-measure dose of fluorescence signals. Herein, we report the diagnostic performance of this novel system. The physiological and pathological expression patterns of p16INK4A in normal/abnormal cervical cells were quantitatively investigated. The age-, HPV genotype- and lesion-dependent expression of p16INK4A was precisely appraised, and the use of the p16INK4A-positive ratio for assessing the risks of high-grade lesions was validated. The obtained FCM data provided important evidence for the applicability of p16INK4A-based quantitative pathology in the risk stratification-oriented management of women threatened with neoplastic lesions. 16,24,25
In general, the current study aimed to address the critical questions that might be encountered during clinical application of p16INK4A FCM, namely, (1) why was the 'p16INK4A-positive ratio' selected as a more sensitive measurement scale for p16INK4A FCM; (2) what is the normal expression pattern of p16INK4A at the cervical epithelium; (3) what changes of the cervical p16INK4A expression pattern could be detected by FCM after HPV infection; (4) what is the diagnostic performance of p16INK4A FCM in identifying histological HSIL+ lesions; (5) could the 'p16INK4A-positive ratio' be applied to predict the 2-year pre-cancer/cancer risks of women with abnormal HPV/Pap tests, especially under complex pathological conditions.

Study population
The study population comprised women who attended an annual gynaecological examination (including a cervical cancer-screening program) between 20 September 2018 and 20 March 2020, at 10 local medical centres of mainland China. The enrolment process contained three phases (Figure S1). For phase 1, 20 September 2018 to 20 March 2019, according to their HPV DNA and Pap testing results, HPV-negative Pap-normal women were consecutively enrolled from participating medical centres; signed informed consent was collected; clinical information was recorded; and exfoliated cervical samples (the remnant collection of Pap) were stored for later application of p16INK4A FCM. At each medical centre, eligible HPV-positive and/or Pap-abnormal women who attended during the same period were systematically enrolled as a candidate pool, and the information and cervical samples of these women were properly kept until phase 2 or 3. During phase 2, after the age compositions of the enrolled HPV-negative cervical healthy women were determined, age-matched HPV-positive women who had normal Pap results were selected from the candidate pool and called back to participate in our observational study; their informed consent was obtained before the first round of follow-up. In addition, the candidate pool was continuously expanded by collecting eligible HPV-positive and/or Pap-abnormal women during the same phase. Phase 2 enrolment ceased on 20 September 2019, and the study entered phase 3 once the HPV genotyping process had been completed for each of the phase 2 women (i.e., the age-matched HPV-positive Pap-normal women).
In phase 3, HPV-positive/-negative Pap-abnormal women were selected from the pool based on the following rules: (1) the enrolled HPV-negative Pap-abnormal women should comprise a group with the same/similar age composition as that of the group of HPV-negative Pap-normal women; (2) the enrolled HPV-positive Pap-abnormal women should comprise a group with the same/similar age composition and HPV genotype composition as those of the group of HPV-positive Pap-normal women; (3) the women were randomly selected and called to determine their willingness to participate if they met the age and HPV genotype requirements; informed consent was obtained from the enrolled women before the first round of follow-up; and (4) if there were still vacancies regarding a specific age and/or HPV genotype condition that could not be filled from the pool, the candidates were selected and enrolled directly from outpatient departments of the participating centres before 20 March 2020. Upon the enrolment deadline, the remaining and redundant candidates in the pool were released, and their information and cervical samples were deleted or discarded under surveillance. The study protocol (Figure S2) was approved by the ethics committee of Ren Ji Hospital and by the participating medical centres.

Antibodies and cell lines
The human p16INK4A-specific monoclonal antibodies used for FCM purposes were prepared from a set of novel hybridomas (clones 1H1087, 1A72 and 1B517). There were no clones for FCM purposes commercially available to us before this work (even at the time of publication).
To build anti-human p16INK4A antibody hybridomas, the CDKN2A cDNA sequence (NM_000077.4) was synthesised and inserted into the prokaryotic expression vector pET-28a, which was then transferred into Escherichia coli strain BL21 to produce the full-length p16INK4A antigen. The E. coli-expressed product was purified, concentrated and used to immunise BALB/c mice. The obtained novel hybridoma clones were sequentially tested for their immune reactivity and specificity against human p16INK4A protein by ELISA and Western blotting (Figure S3). HeLa cells (ATCC, Manassas, VA, USA) were used as a p16INK4A-positive standard in Western blotting as well as in FCM and were cultured in Dulbecco's modified Eagle medium supplemented with 15% foetal bovine serum in 5% CO2 at 37 °C. The cell strain used as the p16INK4A-negative standard was constructed by knocking the CDKN2A gene out of HeLa cells using the CRISPR-Cas9 technique. 26 The obtained Δ(CDKN2A) HeLa strain was named HeLa16.

Western blotting
Whole-cell protein extracts of HeLa and HeLa16 cells were separated in 10% SDS-PAGE gels and transferred to polyvinylidene fluoride (PVDF) membranes (GE Healthcare, Piscataway, NJ, USA [...]

[...] was used for testing, and a total of 10 000 events were recorded for each run of the sample. The p16INK4A-positive ratio was defined as the ratio of cells in the test portion of the sample with FITC signals stronger than a reference cut-off. The reference cut-off was defined as an FITC signal intensity equal to the lower limit of signal intensity of the top 10, 1 or 0.1% autofluorescent/falsely labelled cells. The FITC signal intensity data of each run were exported into Excel tables (Microsoft, Redmond, WA, USA) to calculate p16INK4A-positive ratios at a given reference cut-off. The cellular mixtures with HeLa/(HeLa+HeLa16) ratios of 0 (i.e., the negative standard), 1, 5, 25, 50 and 100% (i.e., the positive standard) were used as a ladder that was tested in parallel to calibrate the p16INK4A FCM result in each run (Figure S4).
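
The definition of the p16INK4A-positive ratio given above can be illustrated with a minimal sketch. It assumes that per-event FITC intensities are available as plain lists, that the cut-off is derived from the blank-labelled reference sample, and that the top fraction is taken by simple rank; the function names and the toy intensities are hypothetical and do not reproduce the authors' Excel workflow.

```python
def reference_cutoff(reference_intensities, top_fraction):
    """Lowest FITC intensity among the brightest `top_fraction` of reference events."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ordered = sorted(reference_intensities, reverse=True)
    # Number of reference events that make up the top fraction (at least one).
    n_top = max(1, round(len(ordered) * top_fraction))
    return ordered[n_top - 1]


def positive_ratio(test_intensities, cutoff):
    """Percentage of test events whose FITC signal is strictly above the cut-off."""
    n_positive = sum(1 for x in test_intensities if x > cutoff)
    return 100.0 * n_positive / len(test_intensities)


# Toy data (hypothetical): 10 000 reference events with intensities 1..10000.
reference = list(range(1, 10001))
# Hypothetical test sample: 9500 background-like events plus 500 bright events.
test = list(range(1, 9501)) + [20000] * 500

for fraction in (0.10, 0.01, 0.001):  # reference cut-offs of 10, 1 and 0.1%
    cutoff = reference_cutoff(reference, fraction)
    print(fraction, cutoff, positive_ratio(test, cutoff))
```

In this toy case the stricter cut-offs (1 and 0.1%) count only the 500 bright events, whereas the 10% cut-off also admits part of the background, which is why a ratio is always reported together with its reference cut-off.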

HPV DNA testing and genotyping
The details of the HPV DNA and genotyping tests have been described previously 27 with a few modifications. Briefly, for each woman, the total DNA was isolated from a 5 mL liquid-based cervical cytological sample (collected by cervical brushing) using a QIAamp DNA Mini Kit (Qiagen, Shenzhen, Guangdong, China) and maintained in PBS at −20 °C. The quality of the isolated DNA was tested by measuring the β-actin copies. The consensus HPV L1 primer pair MY09/11 was adopted to amplify viral genomic DNA via the polymerase chain reaction (PCR). 28 The resultant PCR products were used to construct a DNA fragment library and subjected to next-generation sequencing on a NovaSeq6000 platform (

Cervical cytology
Cervical cytological samples were collected in ThinPrep fixative (Hologic, Marlborough, MA, USA) for the liquid-based Pap test. 29 ThinPrep slides were prepared, stained and processed by using the ThinPrep 2000 System (Hologic). According to TBS 2001, the diagnostic terms on Pap smears were classified into the following categories: NILM, ASC-US, LSIL, ASC-H, HSIL and SCC. All Pap slides were independently analysed by two pathologists. Discrepant diagnoses were reviewed for consensus. The remnant cervical sample of each woman after the preparation of a liquid-based Pap slide was further used for p16INK4A FCM (10 mL) and HPV DNA testing (5 mL), that is, approximately 15 mL in total.

Follow-up
HPV-positive and/or Pap-abnormal women were followed up regularly at 3-month intervals for 2 years. Women with HPV infections or cytologically diagnosed with ASC-US or worse lesions (i.e., ≥ASC-US) were referred for colposcopy before initiating a regular follow-up schedule. If their biopsy results indicated an intraepithelial squamous lesion lower than HSIL (e.g., cervicitis, LSIL), the women were then included in one of the three follow-up cohorts, namely, HPV-positive Pap-normal, Pap-abnormal biopsy-negative and biopsy-confirmed LSIL, based on their initial HPV, cytological and biopsy diagnoses. During follow-up, the women were offered regular HPV DNA and Pap tests. If their regular Pap results revealed a persistent LSIL lesion over 1 year or a lesion worse than their initial cytological/biopsy diagnoses, the women were referred for colposcopy again. Women whose HPV DNA test revealed evidence of an infection with a novel viral genotype(s) or a relapse of a former viral infection following an intermittent negative viral DNA test were also referred for colposcopy and biopsy (if necessary). The (primary) endpoint event was biopsy-confirmed HSIL or invasive cervical cancer within 2 years of follow-up.

Statistics
Two-sided χ2 test/Fisher's exact test and ANOVA/Student's t-test were used to compare enumeration (e.g., age composition, viral genotype composition) and measurement (e.g., p16INK4A-positive ratio, p16INK4A increment) data, respectively. The linear relationship between the actual and FCM-detected nominal p16INK4A-positive ratios as well as that between the viral genotype-specific p16INK4A increments and multiple-infection-induced variations of genotype-specific p16INK4A-positive ratios were analysed using Pearson's product-moment correlation coefficient. The changes in p16INK4A increments between two paired sets of groups of women, for example, HPV-negative versus HPV-positive women (age-group pairs), single infections versus multiple infections (viral genotype pairs) and Pap-normal versus Pap-abnormal women (age-group pairs), were analysed using the Wilcoxon signed-rank test. The cumulative risk of high-grade lesions in p16INK4A-normal versus -abnormal women was compared using Kaplan-Meier curves and the log-rank test. The independence of the hazards (presented as hazard ratios [HRs]) of clinicopathological risk factors (e.g., age, viral genotype, Pap test, p16INK4A expression status) contributing to the occurrence of high-grade lesions during follow-up was analysed using a univariate or multivariate Cox regression model. The receiver operating characteristic curve was used to determine the optimal cut-offs of the p16INK4A-positive ratio in diagnosing p16INK4A-abnormal cases among populations with higher-than-normal (i.e., higher than the average level of p16INK4A expression) and lower-than-normal p16INK4A expression. SPSS 18.0 software (IBM, Armonk, NY, USA) was used for analyses, and p < .05 was considered statistically significant.
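
As an illustration of how an optimal cut-off can be read from a receiver operating characteristic analysis, the sketch below scans candidate cut-offs and keeps the one with the maximal Youden's index (sensitivity + specificity − 1). It is a simplified, single upper cut-off version written under stated assumptions: the study itself used SPSS and a double-cut-off-ratio criterion, and the function names and the ten data points here are hypothetical.

```python
def youden_index(scores, labels, cutoff):
    """Sensitivity + specificity - 1 when scores >= cutoff are called abnormal."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_cutoff(scores, labels):
    """Try every observed score as a cut-off; return (cutoff, Youden's index)."""
    candidates = sorted(set(scores))
    return max(((c, youden_index(scores, labels, c)) for c in candidates),
               key=lambda pair: pair[1])


# Hypothetical positive ratios (%) with biopsy outcomes (True = HSIL+).
ratios = [13.2, 13.8, 14.1, 14.5, 15.0, 17.9, 18.4, 19.6, 21.0, 22.3]
hsil = [False, False, False, False, False, True, False, True, True, True]

cutoff, j = best_cutoff(ratios, hsil)
print(cutoff, round(j, 3))
```

A lower cut-off for the lower-than-normal population could be found the same way by reversing the comparison, so that scores at or below the cut-off are called abnormal.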

The p16INK4A FCM quantification system detected p16INK4A-positive cells at a threshold as low as 0.1%
Clone 1H1087, a mouse anti-human p16INK4A monoclonal antibody obtained by immunising the animal with E. coli-expressed full-length p16INK4A protein, was selected as a candidate clone to detect p16INK4A-positive cells in the modified FCM platform. The p16INK4A protein-binding ability (specificity and sensitivity) of this antibody was examined by Western blotting. For whole-cell protein extracts of HeLa at a loading amount equal to 10^4 cells/lane (loading volume: 20 μL), clone 1H1087 showed no nonspecific bands on the PVDF membrane, while the other two clones, 1A72 and 1B517, which were prepared in parallel, detected bands for irrelevant proteins in the lysates of HeLa or HeLa16 cells (note, HeLa16 is a p16INK4A-knockout strain of HeLa; Figures 1A and S3). The minimal detectable amount of p16INK4A by clone 1H1087 was equivalent to a protein extract of 10^2 HeLa cells/lane in Western blotting (Figure 1B). The FCM performance of this clone was further examined with a set of external standards, where HeLa cells were serially diluted with HeLa16 cells. Linear regression analysis indicated that the number of p16INK4A-positive cells (presented as the 'p16INK4A-positive ratio' in a 10 000-cell sample) detected by p16INK4A FCM exhibited a near-perfect linear relationship with the actual numbers (presented as the 'actual p16INK4A-positive ratio' in Figure 1C) when reference cut-offs were set at 10, 1 and 0.1%, respectively (note, 'FCM-detectable/FCM-detected cells' indicates cells with FITC signal intensities greater than the top 10, 1 and 0.1% of pseudo/autofluorescent cells within a 10 000-cell sample); a similar linear relationship was found between the increments of mean fluorescence intensity (ΔMFIs) measured and the actual numbers of p16INK4A-positive cells for a series of external standards (Figure 1C). The correlation coefficient (r) values between the FCM-detected ratios of p16INK4A-positive (reference cut-off = 10%), strong-positive (reference cut-off = 1%) and extremely strong-positive (reference cut-off = 0.1%) cells and their actual numbers (percentages) reached 0.9961, 0.9944 and 0.9973, respectively, and were very close to the performance (r = 0.9917) of the ΔMFIs (Figure 1C). Regarding the minimal detection threshold (or detection resolution) of the clone 1H1087-based FCM, we found that a slight variation of 0.1% or even 0.01% of positive cells (i.e., 1-10 cells) within a 10 000-cell sample was sensitively detected when the reference cut-off was set to 1 or 0.1% (Figure 1D and Table S1). However, the ΔMFI measurement system could only detect a variation in the positive ratio > 10% (i.e., alteration of the number of positive cells > 1000 cells per 10 000-cell sample). Therefore, compared with ΔMFI, the 'p16INK4A-positive ratio' is a more sensitive measurement scale (Figure 1D). Since the minimal detectable variation in the p16INK4A-positive ratio (i.e., highest resolution) using FCM reached 0.1% at the reference cut-off of 10% (i.e., 10 cells per 10 000-cell sample; Figure 1D and Table S1), which satisfied the requirement for the establishment of an effective cervical cancer screening program, we then applied

FCM-detected p16 INK4A -positive cells displayed an age-dependent distributional mode in the normal cervix
We consecutively enrolled 17 562 women (Figures 2 and 3A, also Figure S2) who visited the outpatient department for an annual gynaecological examination to explore the age-dependent p16INK4A expression pattern in the normal cervix (see Table S2 for the demographic, clinical and pathological characteristics of the enrolled women). All these women were negative for cervical HPV DNA and Pap tests. At the reference cut-off of 10%, the detected p16INK4A-positive ratio was highest in women aged 40-49 years (i.e., 14.1 ± 1.7%), while this ratio dropped in women aged ≥60 years (Figure 3B and Table S6). The actual number of p16INK4A-positive cells decreased to approximately 70% of the peak in women aged ≥60 (4.1 vs. 3.1 per 100 sample cells; note, the FCM-detected positive ratio was 13.1 ± 1.7% for women in the age group of 60-69 years; Table S6). Considering that the age-related variation of p16INK4A-positive ratios was prominent and non-negligible, we thereafter adopted age-adjusted criteria to assess p16INK4A overexpression in exfoliated cervical cells.
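
The counts quoted above (4.1 and 3.1 per 100 sample cells) match the detected ratios minus the 10% reference cut-off. The short sketch below makes that arithmetic explicit; reading the subtraction as the intended background correction is an assumption inferred from the numbers, and the function name is illustrative.

```python
REFERENCE_CUTOFF = 10.0  # percent; assumed background share at the 10% cut-off


def background_corrected(detected_ratio, reference_cutoff=REFERENCE_CUTOFF):
    """Detected positive ratio (%) minus the reference cut-off (%)."""
    return detected_ratio - reference_cutoff


peak = background_corrected(14.1)   # women aged 40-49 years
older = background_corrected(13.1)  # women aged 60-69 years

# Corrected counts per 100 cells and the older group's share of the peak.
print(round(peak, 1), round(older, 1), round(older / peak, 2))
# prints: 4.1 3.1 0.76
```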

Aberrant expression of p16 INK4A in HPV-infected cervical epithelial cells
Women with a positive HPV DNA test and a normal Pap (see Table S2 for the demographic, clinical and pathological characteristics of the enrolled women) were enrolled to explore the effect of HPV infection on p16INK4A expression (i.e., p16INK4A-positive ratio) in the cervix. We applied an age- and HPV genotype-matching strategy to build study cohorts (i.e., age groups; Tables 1 and 2), which ensured that the genotypic composition of HPV was not significantly different between each cohort and thereby avoided genotype-related biases in the collected data (Table S3). In the six age groups enrolled (3596 women in total; see Figure 2), HPV-induced increments of p16INK4A-positive ratios (i.e., p16INK4A increment) varied with the age of the women (Figures 3A-C and Table S3). The detected p16INK4A increment among the HPV-infected women aged 16-19 years was <1% (0.87% exactly; Table S6), which was slightly higher (p = .001) than that among their HPV-negative counterparts (Figures 3B and C); however, for women aged 30-69 years, this value increased to >1% (1.22%; Figure 3C) and reached a maximal level of 1.38% in the age group of 40-49 years (Figure 3C), reflecting substantial harm to the host genome after a long-term HPV infection. Since our age- and genotype-matching strategy also ensured that the populations infected with various viral genotypes shared the same age composition (Tables 2 and S3), we further investigated the age-independent but viral genotype-specific p16INK4A increments in the enrolled women. Compared with the p16INK4A-positive ratios detected in the HPV-negative controls (13.9 ± 1.8%; Table S6), women infected with HPV-16 (single infections) were associated with the highest p16INK4A increments (1.6%; Figure 3D and Table 2), while those infected with HPV-18 or other high-risk (HR) genotypes (single infections) exhibited relatively lower increments (0.5-1.4%; Figure 3D and Table 2, also Table S3a).
For women infected with low-risk (LR) HPVs, the detected increment was further reduced (−0.002 to 1.4%; Figure 3D and Table 2, also Table S3a). The differences among genotype-specific increments were statistically significant (Figure 3F and Table 2, also Tables S3 and S6). Notably, the p16INK4A-positive ratios of HR genotype-related multiple infections were generally lower than those of their single-infection counterparts (Figures 3E and F). Although most HR single infections were associated with higher p16INK4A increments (Table 2 and Figure 3D), these increments were significantly reduced when women were co-infected with LR genotypes (Table 2 and Figure 3F), where the inhibitory effect was proportional to the increment-inducing effect of the original HR genotype (Figure 3G). In contrast, LR HPVs evoked a lower increment in single-infection statuses but gained significantly higher promotions in the p16INK4A increment after co-infection with HR genotypes (Figures 3D, F and G).

FIGURE 1 [...] and FCM measurements in multiple measuring scales (e.g., ΔMFI and p16INK4A-positive ratios at reference cut-off = 10, 1 and 0.1%, respectively). The tested samples were composed of HeLa/HeLa16 mixtures, within which the overall number of cells was fixed at 10^4. The actual percentage of the tested sample was deliberately tuned based on the HeLa/(HeLa+HeLa16) ratio. The linear regression functions and coefficients were given for each pair of measurement parameters compared. (D) The minimal detection thresholds of p16INK4A FCM according to the measuring scales. Each FCM measurement for a given percentage of p16INK4A-positive cells (mimicked using a HeLa/HeLa16 mixture) was compared with that of a HeLa/(HeLa+HeLa16) ratio = 0 (p16INK4A-negative) sample. The experiments were performed in triplicate, and a two-sided Student's t-test was used for comparison. The minimal detection threshold conferred by a measuring scale was determined by the minimal number of p16INK4A-positive cells that could be detected by FCM to give a significantly different measurement (compared with a p16INK4A-negative sample) using the indicated scale. The overall number of cells was fixed at 10^4 for each sample tested (i.e., HeLa/HeLa16 mixture). *, p < .05; **, p < .01; ***, p < .001.

FIGURE 2 The research profile of two-tier clinical validation work for p16INK4A FCM. We performed two tiers of clinical investigations on the diagnostic and prognostic efficacies of p16INK4A FCM. First, 24 100 women with different HPV infection statuses and Pap testing results were enrolled from ten medical centres nationwide. Cross-sectional diagnostic studies were conducted and the diagnostic efficacies of p16INK4A FCM on histological HSIL+ were determined by comparing its performance with a gold standard technique: colposcopy-guided biopsy. Shown are the rates of abnormal p16INK4A FCM results in women with specific histopathological categories as confirmed by biopsy. The optimal detection criterion was obtained by calculating the maximal Youden's index under various cut-off ratios of p16INK4A FCM. Second, the prognostic efficacies were validated in three cervicopathological conditions, namely, HPV-positive Pap-normal, Pap-abnormal biopsy-negative and biopsy-confirmed LSIL. The cohorts for performing the corresponding 2-year prospective observational studies were collected from the 24 100 women in the initial cross-sectional diagnostic studies. Women who had completed colposcopy examinations and whose biopsy pathological results met the criteria were sub-enrolled into the corresponding follow-up cohort. hLSIL, histological LSIL. hHSIL, histological HSIL. p16INK4A Abn., the percentage of cases with abnormal p16INK4A FCM detection results within a specific population of women.

Aberrant expression of p16 INK4A in pathomorphologically abnormal cervical epithelial cells
Women with abnormal Pap results (in either HPV-positive or HPV-negative conditions; see Figure 2) were enrolled to test the quantitative relationship between p16 INK4Apositive ratios and Pap pathomorphological changes as well as to evaluate the performance of p16 INK4A FCM in diagnosing neoplastic lesions (using colposcopy-guided biopsy as a gold standard). According to the age-and HPV infection-dependent distributional patterns of the aforementioned 17562 (HPV-negative, NILM) and 3596 cases (HPV-positive, NILM), age-and viral genotypematched women with abnormal Pap tests were assessed and comprised each study cohort based on TBS categories (see Tables S2, S4 and S5 for the demographic, clinical and pathological characteristics of the enrolled women). The NILM cohorts were used as negative controls to scale the lesion-specific p16 INK4A increments in HPVpositive and HPV-negative conditions. The FCM-detected p16 INK4A -positive ratios were significantly higher in Papabnormal women than in NILM women (NILM women vs. non-NILM women: 13.9 ± 1.8 vs. 17.7 ± 5.0-21.4 ± 7.2% F I G U R E 3 The age-and viral genotype-dependent expression of p16 INK4A in the Pap-normal condition. (A) The age compositions of the HPV-negative and HPV-positive NILM women, which were compared using the two-sided χ2 test. (B) The age-dependent expression of p16 INK4A in the HPV-negative and HPV-positive NILM women, which was analysed by ANOVA among the respective populations. For each age group, the expression levels (positive ratios) of p16 INK4A were compared between HPV-negative and HPV-positive women using the two-sided Student's t-test. (C) The p16 INK4A increments of each age group, which were compared using the Wilcoxon signed-rank test. (D) The viral genotype-dependent expression pattern of p16 INK4A in HPV-positive NILM women, which was analysed using ANOVA. 
The expression level of p16 INK4A (i.e., p16 INK4A -positive ratio) in women infected with a specific viral genotype was seriatim compared with the average p16 INK4A level of HPV-negative NILM women using the two-sided Student's t-test. The dotted line signifies the average level of p16 INK4A in an HPV-negative NILM population. (E) The age compositions of the women with single and multiple infections in the HPV-positive NILM population, which were compared using the two-sided χ2 test. (F) The viral genotype-specific p16 INK4A increments of multiple infections relative to their respective single-infection counterparts. The overall difference in p16 INK4A increments between single and multiple infections was compared using the Wilcoxon signed-rank test. The intragenotypic p16 INK4A increment difference was compared using the two-sided Student's t-test between cases with single and multiple infections. (G) The relationship between the genotype-specific p16 INK4A increments of single infections relative to the average p16 INK4A level of HPV-negative NILM women and the genotype-specific p16 INK4A increments of multiple infections relative to the corresponding single-infection counterparts was analysed using Pearson's product-moment correlation coefficient (the dotted line). The linear regression coefficient r and p value are given. HR HPVs are highlighted in red characters. *, p < .05; **, p < .01; ***, p < .001.   (27) 188 (14) 28 (2)   were compared seriatim with that of the HPV-negative women. The two-sided χ2 test was used for the analysis. Additionally, for the age groups, the viral genotyperelated distributional characteristics of the infected women were also compared with one another using the two-sided χ2 test. All the differences compared in this table were of no statistical significance; details of the statistical analyses can be found in Table S3. c Data are presented as the mean ± SD. 
The viral genotype-related p16INK4A increment was calculated based on the FCM data of those with single infections. The p16INK4A-positive ratios of the women infected with HPV-16, -18, HR HPVs and LR HPVs were compared seriatim with that of the women with no viral infections using the two-sided Student's t-test (see Figure 3D). The genotype-specific p16INK4A increments and their related standard deviations (SD) are given.

… Table S6). For each TBS category, an age-dependent distributional pattern of p16INK4A increment was observed (Table S6), while the age group with peak increment varied among the cohorts (Figures 4A and B). For HPV-positive Pap-abnormal women, the average increments of p16INK4A-positive ratios were higher in those with ASC-US/-H conditions (6.0 and 6.3%, respectively) than in those with HSILs (5.6%) and were lowest in those with LSILs (4.1%; Figure 4C and Table S6). However, for women with negative HPV DNA tests, although the p16INK4A increments were higher in those with ASC-US/-H (5.5 and 6.4%, respectively) than in those with LSILs (3.9%), the detected increments reached their highest value (7.5%) in those with HSILs, justifying the utility of p16INK4A as a signature for triaging women with high-grade lesions in this subsituation (Figure 4C). We then explored the possible factors contributing to the paradoxical increments observed in HPV-positive ASC-US/-H cases (Table S6). The scatter plot revealed that HPV-positive high-grade lesions (i.e., HSILs) were more frequently associated with lower-than-normal levels of p16INK4A (Figure 4D), and the number of these cases increased with the severity of Pap abnormalities (Figure 4D), leading to a decreased average p16INK4A increment in LSIL/HSIL cases (Figures 4C and D). To evaluate the performance of p16INK4A FCM for predicting biopsy results of the subsequent colposcopy, we adopted age-adjusted p16INK4A-positive ratios (normalised to the age group of 40-49 years) and applied a double-cut-off-ratio criterion (Table 3 and Figure 5).
The optimal cut-offs were 18.0% for p16INK4A overexpression cases and 11.4% for lower-than-normal cases (both related to a maximal Youden's index in their respective populations according to the definition described in section Methods and Figures 5A and B). When the HPV DNA-combined p16INK4A FCM criterion was adopted, the optimised cut-offs of the p16INK4A-positive ratio were 18.3% (upper cut-off) and 11.9% (lower cut-off) for HPV-positive cases and 18.6% (upper cut-off) and 11.2% (lower cut-off) for HPV-negative cases, by which a maximal Youden's index of 0.78 was obtained, exceeding that of the HPV DNA and Pap co-test (0.72) by 5.6 percentage points (Table 3 and Figures 5C-F). In total, 90.5% of HPV DNA test-unidentified and 100% of Pap-unidentified HSIL cases were recognised by p16INK4A FCM using the HPV DNA-combined criterion (Table 3; also Tables S6 and S7).
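
The HPV DNA-combined double-cut-off rule can be sketched in a few lines of Python. This is a minimal illustration rather than the software used in the study: the names `CUTOFFS` and `p16_status` are hypothetical, the input is assumed to be an age-adjusted p16INK4A-positive ratio in percent, and treating a value that falls exactly on a cut-off as normal is an assumption, because the text does not state how boundary values are handled.

```python
# Cut-offs (percent of p16INK4A-positive cells) quoted in the text for the
# HPV DNA-combined criterion. All names here are hypothetical.
CUTOFFS = {
    True: (11.9, 18.3),   # HPV-positive: (lower cut-off, upper cut-off)
    False: (11.2, 18.6),  # HPV-negative: (lower cut-off, upper cut-off)
}


def p16_status(ratio_pct: float, hpv_positive: bool) -> str:
    """Classify an age-adjusted positive ratio with the double cut-off.

    Assumption: values exactly on a cut-off count as normal.
    """
    lower, upper = CUTOFFS[hpv_positive]
    if ratio_pct > upper:
        return "overexpression"
    if ratio_pct < lower:
        return "lower-than-normal"
    return "normal"


def is_abnormal(ratio_pct: float, hpv_positive: bool) -> bool:
    """True when the ratio lies outside the normal window on either side."""
    return p16_status(ratio_pct, hpv_positive) != "normal"


if __name__ == "__main__":
    for ratio, hpv in [(20.0, True), (11.5, True), (11.5, False)]:
        print(ratio, hpv, p16_status(ratio, hpv))
```

The same ratio can fall on different sides of the rule depending on HPV status: 11.5% is below the lower cut-off for an HPV-positive case but inside the normal window for an HPV-negative case.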

FCM-based p16INK4A-positive ratios predicted 2-year pre-cancer/cancer risks in women with HPV infections and/or Pap abnormalities
There were both HPV-positive and HPV-negative women (1428 in total) with abnormal p16INK4A-positive ratios who had negative colposcopy-guided biopsy results (including 68 women with biopsy-negative Pap-normal results and 1360 women with biopsy-negative Pap-abnormal results). Additionally, there were women with biopsy-confirmed LSILs (465 in total) who were followed up for up to 2 years as required by the American Society for Colposcopy and Cervical Pathology (ASCCP) guidelines. Therefore, we conducted a prospective study of these women to investigate whether they would run a higher risk of HSIL+ (i.e., HSIL and worse lesions, see Figure 2) in 2 years compared with their p16INK4A-normal counterparts (i.e., 11.2-18.6% in the HPV-negative condition or 11.9-18.3% in the HPV-positive condition). As expected, for all three conditions, namely, the HPV-positive Pap-normal (NILM) condition, Pap-abnormal biopsy-negative condition and biopsy-confirmed LSIL condition, women with an abnormal p16INK4A status exhibited a significantly higher incidence rate of histological (i.e., biopsy-confirmed) HSILs (Tables 4 and S8). In the cohorts studied, abnormal p16INK4A status was consistently an independent prognostic determinant when analysed together with other risk factors, namely, HPV DNA status, viral genotype, preceding Pap test and age at diagnosis; its hazard ratios remained high (4.3-7.2-fold relative to the '1.0' reference) and were statistically significant in multivariate analyses (Tables 5-7).
During the follow-up period, for p16INK4A-abnormal cases within the HPV-positive NILM and Pap-abnormal biopsy-negative (subsituation: HPV-positive) cohorts, the cumulative risks (PPVs at year 2: 13.2-15.6%) of histological HSILs exceeded the ASCCP-recommended threshold for colposcopy referral (i.e., immediate risk of CIN3+ ≥4%), while in the HPV-positive subsituation of the biopsy-confirmed LSIL cohort, this risk was further doubled for p16INK4A-abnormal women (21.3% at year 1, 29.3% at year 2; Figure 6 and Table 5, also Table S8), suggesting the necessity of early intervention for these women (considering that the threshold recommended by ASCCP for immediate treatment was an immediate risk of CIN3+ ≥25%). In our observation, only the women with HPV-negative biopsy LSIL and Pap-abnormal biopsy-negative status with normal p16INK4A expression displayed significantly lowered cumulative HSIL risks (PPVs at year 2: 0-0.7%), which were close to the ASCCP-recommended '1-year return' threshold (i.e., 5-year CIN3+ risk ≥0.55%) in 2 years per the HPV DNA and Pap test results.

For each dataset, the Wilcoxon signed-rank test was used for statistical analysis. (C) The lesion-related (classified based on TBS terms) expression patterns (positive ratios and increments) of p16INK4A, which were compared using ANOVA. Blue bar, the statistical analysis was performed only for HPV-negative women. Brown bar, the statistical analysis was performed only for HPV-positive women. Black bar, the statistical analysis was performed for all Pap-abnormal women. The dotted line, the average p16INK4A level (positive ratio) of the HPV-negative NILM population. (D) The lesion-related distribution patterns of p16INK4A-positive ratios under different HPV DNA and Pap-abnormal situations. The two-sided χ2 test was performed to analyse the distribution patterns in the HPV-negative and HPV-positive conditions.
Notably, in the HPV-negative condition, there were more p16INK4A lower-than-normal events found in women with low-grade lesions, while in the HPV-positive condition, there were more p16INK4A lower-than-normal events found in women with high-grade lesions; both distribution patterns were statistically significant. SD, standard deviation.

TABLE 3 Diagnostic efficacy of p16INK4A FCM for colposcopy referral and its comparison with known HSIL+-triaging strategies.

(Table 5). However, for HPV-positive women, either in the Pap-normal or Pap-abnormal cohorts (including two subsituations: biopsy-negative and biopsy-confirmed LSIL), we observed that the cumulative HSIL risks of women with a p16INK4A-normal status persistently ascended and crossed the 1-year return threshold at year 1 (PPVs at year 1: 1.4-2.1%) while remaining below the colposcopy referral threshold at year 2 (PPVs at year 2: 3.1-3.8%; Figure 6 and Table S8). Hence, differences in 2-year outcomes between p16INK4A-normal and p16INK4A-abnormal women should be mainly attributed to the prognostic performance of p16INK4A among HPV-positive women in the three cohorts studied (Figure 6).
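
The comparison of cumulative risks with the ASCCP thresholds quoted in this section can be illustrated with a short Python sketch. It only mirrors the comparison made in the text (2-year cumulative HSIL risks set against the quoted 25%, 4% and 0.55% thresholds) and is not a clinical decision tool; the names `THRESHOLDS_PCT` and `risk_band` are hypothetical, and placing thresholds that are defined for different risk horizons on a single scale is a simplifying assumption.

```python
# Thresholds (percent) as quoted in the text; the labels are hypothetical.
# Simplifying assumption: thresholds defined for different risk horizons
# (immediate vs. 5-year CIN3+ risk) are placed on one scale.
THRESHOLDS_PCT = [
    (25.0, "at or above immediate-treatment threshold"),
    (4.0, "at or above colposcopy-referral threshold"),
    (0.55, "at or above 1-year-return threshold"),
]


def risk_band(risk_pct: float) -> str:
    """Return the highest quoted threshold that a cumulative risk reaches."""
    for cutoff, label in THRESHOLDS_PCT:
        if risk_pct >= cutoff:
            return label
    return "below all quoted thresholds"


if __name__ == "__main__":
    # Example risks taken from the ranges reported in this section.
    for risk in (29.3, 13.2, 3.8, 0.0):
        print(f"{risk:5.1f}% -> {risk_band(risk)}")
```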

DISCUSSION
For cervical cancer, the role of p16INK4A as a qualitative pathological signature has been well documented. Previously, pathological and clinical guidelines and/or instructions were issued for labelling p16INK4A on cytological (Pap) or histological (biopsy) slides with proper immunostaining techniques as well as for evaluating the severity of pre-cancerous/cancerous lesions (CIN1, CIN2/3 and invasive cancer) with qualitative criteria of p16INK4A. [9][10][11][12][30][31][32] Our current work extended p16INK4A pathology to women with premorbid, ambiguous or transitional pathological statuses of cervical lesions (i.e., HPV-positive Pap-normal, biopsy-negative Pap-abnormal and biopsy-confirmed LSIL), which enabled both immediate and short-term (2-year) risk assessments of high-grade lesions in more specific and difficult clinical settings. Our data showed that p16INK4A FCM (an in vitro diagnostic device, IVD) can be used to quantitatively and precisely determine the cervical p16INK4A expression level, by which both age-dependent physiological and infection-induced aberrant expression of p16INK4A can be measured; therefore, the related intracellular molecular changes upon HPV infection (while the Pap is still normal) can be monitored. These findings justify the utility of p16INK4A-based quantitative pathology for detecting/predicting cervical carcinogenesis and directing early interventions (e.g., colposcopy, physical ablation, loop excision) among high-risk women.
p16INK4A has a 20-year history as an immunostaining signature for cervical pre-cancer/cancer lesions. 9 p16INK4A was originally proposed to eliminate discrepancies between pathologists as they interpreted Pap or biopsy findings into TBS-categorised terms. As the initial inventors of the p16INK4A immunostaining technique, Klaes and his colleagues observed intensive and diffuse immunohistochemical staining of p16INK4A in 100% (60 out of 60) of cases of HSIL and 97% (58 out of 60) of cases of invasive cancer. 9 The authors therefore believed that this signature would ensure a significantly lowered misdiagnosis rate for detecting histological high-grade lesions and a minimised overdiagnosis rate in inflammatory/hyperplastic conditions during microscopic examination. However, it was not until Bibbo et al. developed a specific immunostaining protocol for labelling overexpressed p16INK4A on thin-layer, liquid-based Paps that the qualitative correlation between immunocytochemical staining of p16INK4A and histological HSIL+ lesions was formally established. 33 Despite its practical significance, p16INK4A immunocytochemistry is still a labour-intensive and skill-demanding technique that requires pathologists to carefully discern truly stained cells from false-positive cells under a nonspecific immunostaining background. Moreover, per Bibbo's approach, the semiquantitative count of true-positive cells is a prerequisite for predicting HSIL+ lesions. Sahebali et al. subsequently created a holo-quantitative scale for p16INK4A-based detection of HSILs by counting the exact number of positive cells in a Pap, which obtained a 21.2% PPV at the 95% detection sensitivity. 10 Notably, both Bibbo's and Sahebali's algorithms implied a default principle, that is, the number (or ratio) of p16INK4A-positive cervical cells should be proportional to the risk of high-grade lesions. This principle can be summarised as a 'pyramid theory' with an emphasis that the cervical pre-cancer/cancer should originate from a vast number of morphologically normal but molecularly altered cells (e.g., p16INK4A overexpression).

TABLE 6 Univariate and multivariate analyses of the prognostic factors influencing the 2-year HSIL+ outcomes of Pap-abnormal biopsy-negative women (Cohort 2)

In this study, we applied this principle to triage potential lesions; nevertheless, our technique avoided the burden of manually counting p16INK4A-positive cells under a microscope. The data we obtained were objective owing to the machine-reading and automatic analysis of p16INK4A FCM, and were further improved by the addition of external standards to calibrate this detection system (see section Methods). We, therefore, obtained a better diagnostic performance with p16INK4A, with a much higher Youden's index (0.78) compared with earlier works (0.26-0.56, regardless of HPV DNA status). 10,[33][34][35][36] In addition, compared with the lower PPVs of previous studies, [33][34][35][36] which were between 16.6 and 22.0%, our system exhibited a significantly improved HSIL risk-assessing ability with a PPV of 25.0% among HPV-positive women (Table S7c), reflecting the importance of the precise quantification of p16INK4A-positive cells rather than microscopically assessing their morphology. 10

As a critical finding of this study, we observed that the physiological expression of p16INK4A in the cervix varied with the woman's age. The overall expression of p16INK4A was significantly lower in women aged 20-29 and 50-59 years than in women aged 30-39 and 40-49 years, suggesting that normal levels of female hormones (oestrogen, progesterone) might be essential for maintaining the physiological expression of p16INK4A, while for women at the menopausal period, diminished proliferation activity of the cervical epithelium was associated with decreased p16INK4A expression (e.g., ≥50 years of age; Figure 3B). To our knowledge, this phenomenon has never been reported before, especially in studies that employed traditional immunochemical techniques.

TABLE 7 Univariate and multivariate analyses of the prognostic factors influencing the 2-year HSIL+ outcomes of biopsy-confirmed LSIL women (Cohort 3)
We therefore adopted age-adjusted p16INK4A criteria and achieved an improved PPV for HSILs (Table 3). Moreover, we observed age-dependent p16INK4A increments after HPV infection. Our data showed that p16INK4A increments in the HPV-positive condition also varied with age and viral genotype. Interestingly, the variations in p16INK4A increments we detected could be used to support several known molecular theories of HPV infection. (1) First, in previous reports, the 5-year cumulative incidence of HSIL+ presented a first peak at 11% in women aged 40-44 years and a second at 10% in women aged 60-64 years. [38][39][40][41] We similarly found a major peak of p16INK4A increment in women aged 40-49 years and a minor peak in women aged 60-69 years (Figures 3B and C). In addition, in the literature, the HPV infection statuses of both age groups remained stable from the initial baseline investigation throughout the follow-up period. 40 Hence, the degree of p16INK4A increments should be more closely associated with the occurrence of HSIL+ lesions than the persistence of HPV infection per se, reflecting the exacerbation of the internal tumour suppressive environment in age-related high-risk women.

(2) Second, we found that women in the 20-29 years age group, who were reported to have the lowest risk of HSIL+ lesions, had a minimal increase in p16INK4A after HPV infection. Additionally, current clinical experience has indicated that HPV infections scarcely cause severe neoplastic changes in the cervixes of women aged 20-24 years. 38,39 We therefore confirmed that the carcinogenic capability of HPVs could be measured by their ability to induce p16INK4A dysregulation. (3) Third, after reviewing the literature, we noted that the peak serum oestrogen and progesterone concentrations emerge at ages 40-49 years, 42 which coincided with the peak p16INK4A-positive ratios/increments in the same age group in our study (Figure 3 and Table S6). This phenomenon implies that sex hormones might play a role in accelerating cervical neoplasia formation. However, this inference does not apply to women aged >60 years, who accounted for nearly 20% of cervical cancer cases in previous reports. 39,40 More risk factors need to be weighed in this age group. Per recent advances in the aetiology of cervical cancer, viral integration caused by long-term infection and the subsequent disarrangement of the host genome could be a contributing cause. [43][44][45][46] In our work, this possibility has been partially reflected in the fact that the infection-induced p16INK4A increment peaked again in women aged 60-69 years (Figures 3B and C and Table S6b).
Our study described the practical performance of p16INK4A-based quantitative pathology in identifying high-grade cervical lesions. Similar to earlier studies focused on the application of the HPV DNA and p16INK4A immunocytochemical co-test for triaging women with higher risks of cervical cancer, [11][12][13][14][15] we verified that the combination of p16INK4A FCM and HPV DNA tests could be superior to the traditional HPV DNA and Pap co-test in screening of HSILs with regard to their respective diagnostic parameters, namely, sensitivity (89.5 vs. 86.0%), specificity (88.5 vs. 86.4%), PPV (13.6 vs. 11.3%) and NPV (99.8 vs. 99.7%, respectively; Table 3). Moreover, we gained several lines of evidence that could influence current opinions on the molecular mechanisms of persistent HPV infection and carcinogenic activity. (1) First, through p16INK4A FCM, we established a statistical correlation between viral genotypes and p16INK4A-positive ratios (or increments), where even subtle differences in positive ratio/increment between single and multiple infections could be discriminated (Table S1). The data indicated that HPV-16 and HPV-18, the two most dangerous genotypes, induced the highest overexpression of p16INK4A, while the less dangerous HR and LR genotypes were associated with insufficient overexpression (Figure 3D). This finding corroborated previous works of Sahebali et al. 10 and Sano et al. 47 Sano et al. 47 reported that strong staining of p16INK4A could be predictive of HR HPV infection of the cervix (sensitivity 84%, specificity 98%, PPV 97%, NPV 86%). Sahebali et al. 10 found that HPV-16 induced a significantly higher number of p16INK4A overexpression events than did other HR genotypes, while LR HPVs caused the lowest p16INK4A overexpression. Therefore, both previous reports and our own data pointed to a common fact that p16INK4A, as a downstream element of RB, can be used to measure the harm an HPV infection could do to the host cells.
The hazards of each HPV genotype, therefore, can be ranked per the p16INK4A increments they induce. We also made other notable findings. We observed that an additional infection with less dangerous genotypes can interfere with the carcinogenic capacity of a more dangerous genotype (Figure 3F and Table 2). Our epidemiological data also indicated that the immediate risk (i.e., PPV) of HSIL in women with multiple infections might be lower than that of women with a single infection of an HR HPV genotype (Table S7d). Similar findings have been reported by earlier studies, implying that multiple infections per se are not a pure accelerating factor for carcinogenesis. [48][49][50][51] (2) Second, we observed that for HPV-negative women, those with HSILs had the highest p16INK4A increments, whereas in HPV-positive women, those with ASC-US/-H exhibited the highest p16INK4A increments. Thus, a question was raised: Why does the performance of p16INK4A in diagnosing HSILs become less sensitive in HPV-positive women? Notably, Murphy et al. 52 also reported that the intensity of p16INK4A immunostaining was decreased in HPV-positive women with histological HSILs (i.e., CIN2+), and in two such cases, p16INK4A immunochemistry was even negative. Similar p16INK4A-negative HSIL+ cases have also been reported in other studies. 16 For our own work, although a >90% sensitivity for p16INK4A FCM in detecting HPV-positive HSILs has been ensured after adopting a lower cut-off ratio (Figure 5), more histologically normal or LSIL cases could be incorrectly diagnosed (triaged) as high-grade lesions, and thus, these patients would undergo excessive colposcopy, compromising the PPV of p16INK4A. A potential explanation for these paradoxical phenomena could be the fact that most women with p16INK4A-negative HSILs had a lower-than-normal level of p16INK4A expression (defined as 'extremely low expression').
The existence of these cases lowered the overall expression level of p16INK4A in HSIL cases and hence influenced the diagnostic accuracy of p16INK4A FCM, especially when a single-cut-off-ratio strategy was applied. Previously, a study conducted by Nuovo et al. 53 indicated that aberrant promoter hypermethylation of the p16-encoding gene CDKN2A is also a molecular trait of cervical neoplasia, which could occur at a very early stage of HPV infection. On this basis, there should be two forces co-propelling an initial lesion to a later worse lesion. One is the cell cycle dysregulation induced by the viral oncogenes E6 and E7, which is manifested by p16INK4A overexpression per se; the other is CDKN2A promoter hypermethylation, which could be reflected by the extremely low expression of p16INK4A. The latter produces a distinctive population of neoplastic lesions associated with a poor prognosis. 53 In our study, we identified a group of women with HSILs and extremely low p16INK4A expression by FCM-based detection (note, transformation zone- or sampling-related factors have been excluded; see Tables S6e and S6f). These cases could not be easily identified by traditional qualitative p16INK4A techniques, such as immunochemistry, by which they could have been judged as negative. However, as the double-cut-off-ratio criterion (i.e., p16INK4A-positive ratios >18.0% or <11.4%) was adopted, the sensitivity and Youden's index of p16INK4A FCM were improved by 4.3% and 0.04, respectively (relative to the single-cut-off-ratio strategy; Table 3). Therefore, unlike immunochemistry, FCM offered an opportunity to detect both overexpression and extremely low expression of p16INK4A. Future efforts to apply advanced analytic tools, for example, support vector machine (SVM), should be focused on creating a better-fitting nonlinear algorithm for p16INK4A to predict more carcinogenesis events.
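
The Youden's index values quoted in this discussion follow directly from sensitivity and specificity (J = sensitivity + specificity − 1), and the dependence of PPV and NPV on disease prevalence can be shown in the same way. The Python sketch below is illustrative only: the function names are hypothetical, the sensitivity and specificity values are those quoted from Table 3, and the 2% prevalence is an assumed figure chosen for demonstration, not a value reported in the study.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Sensitivity and specificity as quoted in the text (Table 3).
    j_fcm = youden_index(0.895, 0.885)     # p16INK4A FCM + HPV DNA
    j_cotest = youden_index(0.860, 0.864)  # HPV DNA + Pap co-test
    print(round(j_fcm, 2), round(j_cotest, 2))

    # Hypothetical prevalence, for illustration only.
    ppv, npv = predictive_values(0.895, 0.885, prevalence=0.02)
    print(f"PPV={ppv:.3f} NPV={npv:.3f}")
```

With high sensitivity and specificity but low prevalence, the PPV stays modest while the NPV approaches 100%, which is the general pattern seen in the screening parameters reported above.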
Our study investigated the prognostic value of p16INK4A FCM in Pap-/biopsy-normal (or LSIL) populations. The roles of p16INK4A in predicting outcomes in women with cervicopathologically normal or low-grade abnormal status are of interest to gynaecological clinicians worldwide. However, the outcomes of p16INK4A-abnormal populations with different initial situations of HPV DNA, Pap and/or biopsy were not clear. Before 2008, Carozzi et al. 11,12 conducted a randomised controlled study (i.e., NTCC study) to investigate the efficacy of p16INK4A in triaging biopsy-negative/LSIL women threatened with increased risks of high-grade lesions during 3-year follow-up. The design of the NTCC study was basically consistent with ours with a few exceptions: (1) in the NTCC study, only HPV-positive women were referred to colposcopy, whereas in our work, both those with HPV infections and those with abnormal Pap results (even if HPV-negative) underwent colposcopy; and (2) in the NTCC study, CIN2+ (i.e., histological HSIL+) cases detected by initial colposcopy and biopsy were mixed with those diagnosed during follow-up for calculating p16INK4A predictive values; however, in our study, cumulative risks of HSILs were calculated independent of initial colposcopy/biopsy examination results and were studied separately for three particular conditions (i.e., HPV-positive Pap-normal, biopsy-negative Pap-abnormal and biopsy-confirmed LSIL), providing more detailed information for p16INK4A-based prognosis. Carozzi et al. 11,12 indicated that p16INK4A immunocytochemistry could improve both immediate and short-term assessments for CIN2/3+ risks (sensitivity 91%, specificity 59%, PPV 20%, NPV 95%), and the calculated relative cumulative 3-year risk for histological CIN2/3+ (HSIL+) lesions was 3.74 in p16INK4A-positive women compared with negative controls. Later, in 2012, Wentzensen et al. 13 performed a similar study in HPV-positive Pap-normal women by using a p16INK4A/Ki-67 dual-staining technique. 13,14 The authors supposed that an additional Ki-67 counterstaining could ensure that detected positive cells were strictly epithelial cells and not contaminating inflammatory cells so that the p16INK4A diagnostic specificity could be enhanced. They finally obtained a 5-year risk assessment performance by dual-staining for HSIL+ lesions with sensitivity 83%, specificity 59%, PPV 21% and NPV 96%, which were very close to the parameters reported by Carozzi et al., 12 implying a minimal contribution of Ki-67 to the PPV and NPV improvements of p16INK4A prognosis (approximately 1%). In our study, the 2-year HSIL-risk predictive performance of p16INK4A differed among study cohorts. The sensitivity (100%) and NPV (100%) of p16INK4A FCM reached the highest levels in the HPV-negative biopsy-confirmed LSIL women, while the highest specificity (98.2%) was found in the HPV-positive NILM women. Furthermore, in the biopsy-confirmed LSIL cases with positive HPV DNA tests, the PPV (i.e., 2-year cumulative risk) of p16INK4A-abnormal cases reached its highest level (29.3%), implying that this particular population of women with LSILs had a much greater risk of HSIL+ than had been suggested by the ASCCP guidelines (i.e., 2.8-6.5%), 30,31 which therefore calls into question the rationale for expectant management of these LSIL patients. Taken together, our study provided important evidence that the prognostic significance of p16INK4A should be interpreted on an individualised basis where both viral infection history and Pap/biopsy status need to be co-weighed. This mirrors the spirit of the 2019 version of the ASCCP guidelines. 31

The main strengths of this study include the following: (1) the establishment of a novel p16INK4A detection system, allowing the quantitative depiction of the expression level and pathological increment of p16INK4A in cervical epithelial cells; (2) a large population-based p16INK4A quantification database, revealing the age- and viral genotype-dependent expression of p16INK4A in the cervix, which necessitates an age- and HPV DNA test-adjusted p16INK4A diagnostic criterion; (3) an improved performance in detecting/predicting high-grade lesions in the cervix, allowing precise colposcopy referral and early intervention among high-risk women; and (4) insights into the relationships between p16INK4A quantification and the ASCCP guidelines, promoting evidence-based decisions for treating women with ambiguous HPV and/or Pap results.
This study has limitations. (1) Due to the need to evaluate age- and HPV genotype-specific expression of p16INK4A, we applied an age- and genotype-matching strategy to enrol the study populations. Although HPV-negative Pap-normal cases were collected consecutively, the matched HPV-positive Pap-normal/abnormal cases were semi-objectively enrolled, which might disrupt their natural time-line continuity. (2) We did not determine the association between infection titre and p16INK4A increment. Although the viral infection titres could be detected via quantitative PCR (qPCR), there is currently no United States Food and Drug Administration-approved HPV DNA qPCR detection kit available. (3) No HPV-negative Pap-normal women were referred for colposcopy, and a few women with HPV-negative HSIL might have been missed. However, considering that the number of such cases could be extremely low and the reported incidence was <0.04%, 11,12 our main findings, including the detection sensitivity and PPV of p16INK4A FCM, should not be affected.
In summary, our work has developed the traditional immunochemistry-based p16INK4A qualitative pathology into a quantifiable and interlaboratory cross-verifiable IVD, which was implemented through an FCM platform. This technological innovation built a bridge for pathologists to reach a more objective and consistent opinion on the nature of Pap abnormalities and inform clinicians of the molecular stage of HPV infection(s). The same technique can be applied to other known immunochemical signatures with clear clinical or pathological significance, such as Ki-67, hTERT, EGFR, HER2 and ALK, which could provide more precise and quantitative information for pre-therapy evaluation and post-therapy surveillance.

ACKNOWLEDGEMENTS
This work was supported by grants from the Science and Technology Commission of Shanghai Municipality (Nos. 15441905700, 15DZ1940502 and 17441908000) and the Natural Science Foundation of China (No. 81572548). We thank Ms. Lijun Cai (M.S.) and Yueling Liu (B.M. and M.B.A.) for their kind financial support of this work.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.